Approximate Pattern Matching Using the Burrows-Wheeler Transform

Authors

  • Nan Zhang
  • Amar Mukherjee
  • Donald A. Adjeroh
  • Timothy C. Bell
Abstract

The compressed pattern matching problem is to locate the occurrence(s) of a pattern P in a text string T, using a compressed representation of T, with minimal (or no) decompression. In this paper, we consider approximate pattern matching on the text transformed by the Burrows-Wheeler Transform (BWT). This is an important first step towards developing a compressed pattern matching algorithm for BWT-based compression systems. Algorithms are proposed that solve the k-mismatch problem in
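As a rough illustration of the two notions the abstract relies on, the sketch below computes a BWT by sorting rotations and states the k-mismatch problem as a naive scan over the uncompressed text. It is not the algorithm proposed in the paper, which works on the transformed text. The function names `bwt` and `k_mismatch` and the `$` sentinel are assumptions made for this example.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Burrows-Wheeler Transform via sorted rotations (quadratic, for illustration only).

    Assumes `sentinel` does not occur in `text` and sorts before every
    character of `text`.
    """
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last column of the sorted rotation matrix.
    return "".join(rotation[-1] for rotation in rotations)


def k_mismatch(text: str, pattern: str, k: int) -> list[int]:
    """Naive k-mismatch search on plain text.

    Returns every start position where `pattern` aligns with `text`
    with at most `k` differing characters (Hamming distance <= k).
    """
    m = len(pattern)
    hits = []
    for start in range(len(text) - m + 1):
        mismatches = sum(
            1 for a, b in zip(text[start:start + m], pattern) if a != b
        )
        if mismatches <= k:
            hits.append(start)
    return hits


if __name__ == "__main__":
    print(bwt("banana"))
    print(k_mismatch("banana", "nan", 1))
```

The naive scan only serves to pin down what counts as a k-mismatch occurrence; the point of a compressed-domain method is to report the same positions without first recovering the plain text.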

Similar resources

Approximate Pattern Matching Over the Burrows-Wheeler Transformed Text

The compressed pattern matching problem is to locate the occurrence(s) of a pattern P in a text string T using a compressed representation of T, with minimal (or no) decompression. In this paper, we consider approximate pattern matching directly on Burrows-Wheeler transformed (BWT) text, which is a critical step for a fully compressed pattern matching algorithm on a BWT-based compression algorit...

Compressed-Domain Pattern Matching with the Burrows-Wheeler Transform

This report investigates two approaches for online pattern-matching in files compressed with the Burrows-Wheeler transform (Burrows & Wheeler 1994). The first is based on the Boyer-Moore pattern matching algorithm (Boyer & Moore 1977), and the second is based on binary search. The new methods use the special structure of the Burrows-Wheeler transform to achieve efficient, robust pattern matching...

DNA Sequence Compression Using the Burrows-Wheeler Transform

We investigate off-line dictionary oriented approaches to DNA sequence compression, based on the Burrows-Wheeler Transform (BWT). The preponderance of short repeating patterns is an important phenomenon in biological sequences. Here, we propose off-line methods to compress DNA sequences that exploit the different repetition structures inherent in such sequences. Repetition analysis is performed...

Wheeler Graphs: Variations on a Theme by Burrows and Wheeler

The famous Burrows-Wheeler Transform was originally defined for single strings but variations have been developed for sets of strings, labelled trees, de Bruijn graphs, alignments, etc. In this talk we propose a unifying view that includes many of these variations and that we hope will simplify the search for more. Somewhat surprisingly we get our unifying view by considering the Nondeterminist...

Searching for Unique DNA Sequences with the Burrows-Wheeler Transform

The objective of this study was to present an efficient algorithm that effectively aids the problem of searching for unique DNA sequences in the set of genes. The presented algorithm is based on the Burrows-Wheeler Transform (BWT), a very fast and effective data compression algorithm. The developed algorithm exploits all the advantages offered by the BWT algorithm and the suffix array data stru...

Publication date: 2003